%nsf[w88,jmc]		1988 NSF Basic Research in AI Proposal
\input memo.tex[let,jmc]
\title{PROPOSAL FOR BASIC RESEARCH IN ARTIFICIAL INTELLIGENCE}
\section{Introduction}

	This is a proposal for renewed support of basic research in
artificial intelligence by John McCarthy and students.  We propose to add
Vladimir Lifschitz, a Senior Research Associate, to the proposal starting
in January 1990.  Most of the work is in the epistemological part of AI,
concentrating on developing formal methods of non-monotonic reasoning and
using them to express common sense knowledge and reasoning.  McCarthy
has also recently returned to the subject of heuristics and their formal
expression, and work will continue in this direction, depending on how
successful it is.

Non-monotonic reasoning is important because intelligent action requires
jumping to conclusions from incomplete information.  A person or program
must ordinarily assume, for example, that a situation is normal in
various respects unless there is evidence to the contrary, and must be
prepared to withdraw the conclusion when new facts arrive.  Ordinary
logical deduction cannot express this, because adding premises never
invalidates a deductive conclusion.  (McCarthy 1986) surveys applications
of circumscription to formalizing such common sense reasoning.

\section{Previous Work and its Present Status}

	(McCarthy 1977) contains the first published work on formalized
non-monotonic reasoning, introducing a version of circumscription that
is now mostly obsolete.  A conference on non-monotonic reasoning was
held at Stanford in 1978, and {\it Artificial Intelligence} devoted
a special issue to the subject in 1980, including (McCarthy 1980),
(McDermott and Doyle 1980) and (Reiter 1980).  The first paper introduced
predicate circumscription, the second introduced a now obsolete
``Non-monotonic Logic'' and the third introduced a ``Logic of Defaults''.
These papers started an active branch of research that has resulted
already in several hundred papers.  The 1984 AAAI-sponsored conference
on non-monotonic reasoning was another landmark in the field.  (McCarthy
1986), which further developed circumscription, was based on a paper presented
at that conference.  Robert Moore's (198x) autoepistemic logic is a
significant newer non-monotonic formalism.  A recent international conference
in Grassau, West Germany, saw the introduction of several new
formalisms.

	The basic methods of non-monotonic reasoning have been elaborated and
varied by many people, including Vladimir Lifschitz, John McCarthy,
Michael Gelfond, Donald Perlis, Jack Minker, Kurt Konolige, Robert Moore,
and Raymond Reiter.

Many people have also worked on computational aspects of non-monotonic
reasoning, including Jon Doyle, Johan de Kleer, Vladimir Lifschitz and
Michael Gelfond.

	All the formalized systems of non-monotonic reasoning
start from a collection $A$ of sentences and infer further sentences.
However, the inferred sentences aren't always true in all models
of $A$.  There are three basic ways of determining what sentences
are inferable.

	1. Some methods accept certain sentences because their
negations are unprovable.  In  Prolog, negation as failure
accepts a sentence $¬p$ when a certain standard effort to
prove $p$ has failed.

	2. The second method involves rules whose premise is that a certain
sentence has a model.  This is feasible only when the axioms are such
that it is practical to determine whether a sentence has a model.  In
common sense reasoning, this is quite often the case.  Reiter's default
logic is based on this.

	3. The third class of methods involves second order logic;
circumscription is only one of these methods.  Using second order logic
is likewise practical only in certain domains.  The three methods are
illustrated below.
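
	For the first method, given the single Prolog clause $q ← ¬p$ and
no clauses for $p$, the standard attempt to prove $p$ fails, so $¬p$ is
accepted and $q$ follows.  For the second, Reiter's logic licenses
default rules such as the familiar
%
$${bird(x) : M\,flies(x) \over flies(x)},$$
%
read ``if $x$ is a bird and it is consistent to assume that $x$ flies,
then conclude that $x$ flies''.  For the third, the circumscription of a
predicate $P$ in a sentence $A(P)$ is, in its simplest form, the second
order sentence
%
$$A(P) ∧ ∀P'.(A(P') ∧ ∀x.(P'(x) ⊃ P(x)) ⊃ ∀x.(P(x) ⊃ P'(x))),$$
%
asserting that $P$ satisfies $A$ and that no predicate with a smaller
extension does.  These formulations are the standard ones from (Reiter
1980) and (McCarthy 1980, 1986).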

	Often the three methods agree on what is inferrable, but sometimes
one is more convenient than another.  For example, it is easier to
express knowledge in formalisms involving circumscription than in
the Horn clauses required by Prolog, but certain kinds of consequences are more
readily computable if the facts are represented by Prolog programs.  This
has led Gelfond and Lifschitz to study compiling from circumscription into
Prolog.  This is sometimes possible.

	The present proposal is for continuing a line of research in
artificial intelligence based on the use of mathematical logic, a line
that started in the late 1950s.  We will first explain the general
approach, not assuming that the reader is familiar with previous papers,
then the results of the last few years, then the current scientific
situation in AI, and finally the problems we propose to study in the next
three years.

\section{Mathematical Logic in AI}

	The mathematical logic approach to AI involves the following.

	1. The use of mathematical logical languages for expressing facts
about the world in the memory of a computer.  The most important kinds of
facts are those describing the current situation, the goals to be achieved,
and the consequences of actions.

	2. The use of logical inference to reach conclusions, especially
conclusions about what the system should do to achieve its goals.  In the
early years the only kind of logical inference that had been formalized
was logical deduction, but since the late 1970s non-monotonic reasoning
has been formalized in various ways and can also be used.  The system
tries to find an action strategy that it can infer is appropriate to
achieve its goals.

	3. The actions that can be performed include mental actions
such as inference and observation.  The logic approach will be
complete only when these actions and their effects are also described
by logical formulas and logical inference is also used to decide
what mental actions to perform.  Although the need for this was
discussed in (McCarthy 1960), the right formalism still hasn't been
found.

	A non-mathematical discussion (the restriction was enforced by the
editor) appears in (McCarthy 1988a), included with this proposal as
Appendix A.

\section{Recent results of the logic approach}

	The logic approach has become more popular since the late 1970s
and many problems have been explored.  The biggest area of activity is
non-monotonic reasoning.  Besides circumscription, described in (McCarthy
1980, 1986), there are Reiter's logic of defaults, McDermott's
non-monotonic logic and Moore's autoepistemic logic.  Many authors, e.g.
Lifschitz, Gelfond, Clark, Doyle and de Kleer, have treated methods of
carrying out non-monotonic reasoning in the computer.

	Two important areas of application of logical formalisms to
expressing common sense knowledge are expressing the facts about
the consequences of physical actions and expressing the facts about
people's knowledge, especially what can be inferred about what they
know or don't know on the basis of assumptions about their base
knowledge and what kinds of reasoning they can do.  The former
is treated, for example, in (McCarthy 1970 and 1986) and (Kowalski 1985).
The latter is treated in (McCarthy 1979), (Halpern 19xx), and elsewhere.

	Non-monotonic reasoning has permitted inferring the effects
of actions from much weaker premises than are needed when deduction
is the only tool available.  This is discussed in (McCarthy 1986),
where a formalism using {\it abnormality predicates} is introduced
in combination with circumscription.  Other authors have made considerable
use of the abnormality predicates.  A difficulty with this non-monotonic
reasoning was discovered independently by Lifschitz and by Hanks and
McDermott.  This difficulty gives rise to unintended minimal models
of abnormality in the circumscription formalism and also to unintended
models when the logic of defaults is used.

	Several formalisms that get around the difficulties have
been discovered by Lifschitz (198x), Shoham (198x), Gelfond (198x), etc.
However, none of these solutions seems entirely satisfactory,
because they lack a sufficient degree of {\it elaboration tolerance}.

	During the next three years we propose to explore the following
ideas.

	1. Elaboration tolerance.

	2. Formalization of context.

	3. Intensionally universal logical languages.

	The logical languages used in studying mathematics and
used so far in studying AI share a common limitation.  The designer
of the language has a limited domain in mind, e.g. the natural
numbers or the blocks on a table and the actions that can be
carried out with the blocks.  An intensionally universal language
would not be restricted in advance to any such domain.

	4. Mental situations and hill climbing in mental situation space.

	5. Non-monotonic reasoning using increased reification of
mental entities.

	Consider the Yale shooting problem.  From the human point of
view, it is clear which model of the axioms should be preferred.  A reason
why Fred should die is stated; he is shot.  No reason why the gun
should have become unloaded is given, although it might have.
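
	For readers unfamiliar with the example, the following is a minimal
version of the axioms, in the style of the abnormality formalism of
(McCarthy 1986):
%
$$holds(alive,s0),$$
$$∀s.holds(loaded,result(load,s)),$$
$$∀s.(holds(loaded,s) ⊃ ¬holds(alive,result(shoot,s))),$$
$$∀f\,e\,s.(holds(f,s) ∧ ¬ab(f,e,s) ⊃ holds(f,result(e,s))).$$
%
Circumscribing $ab$ is intended to make Fred dead in
$result(shoot,result(wait,result(load,s0)))$.  But there is a second
minimal model, in which $ab(loaded,wait,result(load,s0))$ holds instead,
the gun mysteriously becomes unloaded during the wait, and Fred survives;
nothing in the circumscription prefers the intended model.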
\section{Formalization of Contexts}

	It is a commonplace that the meaning of any expression depends
on context.  This holds true just as much for the mathematical
logical sentences used in AI as for sentences of natural language.
However, to our knowledge, context is usually treated informally
in the natural language and philosophical literature.  This
won't do for AI programs aimed at approaching human performance, because
it won't allow programs that reason about their own contexts.  Even
with lower performance goals it seems worthwhile to formalize
reasoning about contexts.

	We expect our formalization to have the following properties.

	1. Contexts comprise one of the sorts of individuals in a first
order language, probably supplemented by the second order sentences and
reasoning used in circumscription.

	2. A basic predicate is $holds$ where $holds(p,c)$ asserts that
proposition $p$ holds in context $c$.  (Those familiar with situation
calculus should avoid being misled by a similarity of notation into
assuming that contexts are just a variant of situations.  That they are
different and much more general will become apparent in the following
paragraphs).  We also expect to use the more general $value(exp,c)$ where
$exp$ denotes an expression, but it is more difficult to figure out how to
do this right, because it isn't clear how to specify the range of
$value(exp,c)$.

	3. Contexts are ``rich'' entities, not to be fully described.
Thus the ``normal English language context of an American'' contains {\it
factual assumptions} and {\it linguistic conventions} that not all English
speakers will know.  Moreover, common philosophical assumptions that
nobody has yet examined are included in most contexts, but they might be
transcended when an explicit effort is made by man or machine.

	4. As an example of intended usage consider the formula
%
$$holds(on(cat,mat),c17),$$
%
which plays the role of asserting that a particular cat is on a
particular mat on a particular occasion.  $c17$ carries information
identifying $cat$ with a particular cat and also identifies the
meaning of $mat$ and incorporates the occasion.  The occasion may
be real or may be fictitious, e.g. from the Sherlock Holmes stories.
It would be a convention of $c17$ that a time doesn't have to be
specified in this sentence, although $c17$ might be a context that
requires that some other sentences specify times.  Even the use
of predicate calculus notation is just a convention of $c17$.
Thus we might have $holds(``The cat is on the mat.'',c18)$,
where $c18$ is a context with the same factual assumptions as
$c17$ but with different linguistic conventions.

	5. The above-mentioned $c17$ would be a rather specialized
context.  Assertions made in $c17$, i.e. sentences of the form
$holds(p,c17)$, are connected to less specialized contexts by
sentences of our context language.  For example we might have
%
$$meaning(cat,c17) = meaning(``John McCarthy's cat'',c9)$$
%
where $c9$ is a context in which John McCarthy is identified with
a particular person, but the reference of $cat$ to a particular
beast is not.  $c9$ would also be a context in which English
phrases are meaningful.

	6. There is no most general context.  The reason for this
is that the context language needs to allow the possibility of
creating new contexts generalizing aspects of the most general
previous context.

	A convenient example is provided by a variant of the Searle
(Searle 198x) ``Chinese room'' story.  In this variant the man keeps all
the information about how to respond to sequences of Chinese characters in
his head.  Searle's confusion results from identifying a personality with
the body it occupies.  This is a convention of the English language and
is useful and harmless as long as each body is inhabited by a single
personality.  Explaining the ``Chinese room'' story requires distinguishing
the personality that conducts the Chinese dialog and understands
Chinese from the ``normal personality'' of the man, which doesn't
understand Chinese.

	We plan to develop operations that specify more general
contexts from the contexts already used.  The Chinese room
example requires getting a context that uses different names
for the minds and bodies from some ``normal English context''
that identifies them.

	7. We give one example of a rule relating a context in which
certain sentences are explicitly indexed by times and a more specialized
context in which they are not:
%
$$∀p\,c\,t.(holds(p,spec\hbox{-}time(c,t)) ≡ holds(at(t,p),c)).$$
%
Most likely such rules will have to be stated within contexts, but we haven't
studied how to do this yet.

	8. The context language will include notions of entering and leaving
a context which generalize natural deduction.  Suppose we have
%
$$holds(p,c).$$
%
If we write
%
$$enter\ c,$$
%
as a command, we may then write $p$ by itself.  If we subsequently infer
$q$, we are allowed to write
%
$$holds(q,c).$$
%
When we have entered a context there will be restrictions on inferences
analogous to those that apply when an assumption is made in natural deduction.
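
	Schematically, a derivation using this discipline might run as
follows; the particular sentences are only illustrative:
%
$$\eqalign{&holds(p ⊃ q, c)\cr
&holds(p, c)\cr
&enter\ c\cr
&p ⊃ q\cr
&p\cr
&q\cr
&exit\ c\cr
&holds(q, c).\cr}$$
%
Here the inference inside the context is ordinary modus ponens; the
interesting steps are entering $c$, which strips $holds$, and leaving it,
which restores $holds$.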

	One increased generality over natural deduction is that 
$holds (p,c1)$ and $holds (not p,c2)$ behave differently from formulas
like $c1 ⊃ p$ and $c2 ⊃ ¬p$.  For example, if
$c1$ is associated with the time 5 pm and $c2$ with 6 pm, and $p$ is
$at(I, office)$, then $holds(p,c1)$ and $holds(not\ p,c2)$ might be used (with
other premises) to infer that I went home between 5 pm and 6 pm.  $c1 ⊃ p$ 
and $c2 ⊃ ¬p$ cannot be used in this way, because the logic has to
regard $p$ as expressing a single proposition that is either true or false.

	9.  Perhaps it will be convenient to start the reasoning done 
by the program by assuming that a certain rather general context has been
entered.

	The major goals of the research are to determine the rules that
should be used to relate contexts to their generalizations and
specializations.  Many of these rules will be nonmonotonic.
\section{Mental Situations}

	We intend to study the applicability to mental situations and
mental events of the situation calculus model whose basic formula is
%
$$s' = result(e,s),$$
%
where $s$ is a situation and $e$ is an event.  Mental events include
external observations, deductions, steps of nonmonotonic reasoning, and
also observations of the mental situation itself.  A mental situation is
a generalization of a database in which every sentence has its full
pedigree in the database.  As a database it includes facts about the
external world, goals in the external world, facts about what it does and
doesn't know, and mental goals, including the goal of determining a $y$
satisfying some predicate $P(x,y)$.
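
	In a notation of this kind (which we have not yet fixed; the form
below is only a possibility), a single deduction step might be written
%
$$s' = result(infer(p),s),$$
%
where the new mental situation $s'$ contains $p$ together with a pedigree
recording from what premises and in what mental situation $p$ was
inferred.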

	Here are the reasons for introducing mental situations.

	1. We hope to use them in the control of reasoning.  If we
regard reasoning steps as mental actions, we can prove properties
of reasoning algorithms in the same way we prove properties of
other programs of action.  In a mental situation, we want to
prove the appropriateness of reasoning strategies for achieving
mental goals.

	2. The limitations of the situation calculus, e.g. to
discrete non-concurrent events, arise less in thinking than in
external action.

\section{Nonmonotonic Reasoning}

	Circumscription (McCarthy 1977, 1980, 1986) has proved to be a
very popular form of nonmonotonic reasoning.  The logic of defaults
(Reiter 1980) and autoepistemic logic (Moore 198xx) are also actively
being developed and applied.

	The new forms of nonmonotonic reasoning and the new formalizations
using old forms are aimed at solving problems that have arisen in
treating certain standard examples, e.g. inheritance hierarchies and
the Yale shooting problem (Hanks and McDermott 198xx).

	We plan to explore the idea of making nonmonotonic reasoning
depend more on subjective considerations.  These considerations include
the goals and the pedigrees of the facts being taken into account.  One
way of doing this involves mental situations, as discussed in the
previous section.  Another involves reification, e.g. treating
beliefs as objects.
\section{Heuristics}

	McCarthy has renewed his interest in what might be called
{\it qualitative hill climbing}.  Ordinary hill climbing by a program
uses a numerical evaluator of some kind or a measure of the distance
from a goal, perhaps with the resources used so far subtracted.
Its biggest problem is the existence of local optima, and special
methods must often be devised to escape them.

	Qualitative hill climbing doesn't use numerical evaluation.
Instead it uses a predicate $better(p1,p2)$ that can sometimes say
that position $p2$ is better than position $p1$.  The strategy uses
breadth first search or iterative deepening in order to find a
path to a position that is better than the base position.  It moves
along that path to the new position and starts over.

	The $better$ predicate should have two properties.  First, if
the position keeps getting better, then it should lead to achieving
the goal in a reasonable number of stages of improvement.  Second,
the amount of lookahead required to find a better position should be
reasonable.

	A numerical measure of the value of a position leads trivially
to a $better$ predicate, i.e. we define
%
$$better(p1,p2) ≡ value(p2) > value(p1).$$
%
However, a $better$ predicate is often easier to find since it is
merely a predicate whose transitive closure is an inductive partial
ordering.  A numerical measure permits any two positions to be
compared, but this is often unnecessary and doesn't correspond to
what people do.  Thus someone who can play chess well or solve
Rubik's cube can't give a numerical value to arbitrary positions
or even tell which of two arbitrary positions is better.  His ability
corresponds more to being able to compare a base position to certain
nearby positions and sometimes conclude that the nearby position
represents an improvement.

	In Fall 1987 the idea was tried out by David Throop, a graduate
student at the University of Texas at Austin, and it worked quite well
for the 15 puzzle.  It was quite easy to devise a $better$ predicate
that corresponded to our intuitive notions of what constitutes an
improvement in the 15 puzzle and which worked in the LISP program.
However, the work with the 15 puzzle still involved
the computer doing much more computation to improve a position than
a human would do.  The improvement required is to compare not just
positions but equivalence classes of positions.  Doing so corresponds
to human practice and enormously reduces the search.

	Actually the $better$ predicate was supplemented by a $worse$
predicate.  When the lookahead encounters a position worse than the
base position, it backtracks immediately and doesn't look at the
position's successors.  This also corresponds to human heuristics,
and it greatly reduces the amount of search required to find a better
position.
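
	The following sketch shows how the strategy described above might
be programmed (in Python, for concreteness).  It is only an illustration
of the idea, not Throop's program, which was in LISP; the functions
$neighbors$, $better$, $worse$ and $goal$ are assumed to be supplied for
the particular problem.  For the 15 puzzle, $better$ might, for instance,
count the tiles already in their final positions.

from collections import deque

def improve(base, neighbors, better, worse, max_depth):
    # Breadth-first search from base for a position better than base.
    # Positions judged worse than base are pruned at once, without
    # looking at their successors.  Positions must be hashable.
    frontier = deque([(base, [])])
    seen = {base}
    while frontier:
        pos, path = frontier.popleft()
        if len(path) >= max_depth:
            continue
        for nxt in neighbors(pos):
            if nxt in seen:
                continue
            seen.add(nxt)
            if worse(base, nxt):        # backtrack immediately
                continue
            if better(base, nxt):       # found an improvement
                return path + [nxt]
            frontier.append((nxt, path + [nxt]))
    return None                         # no improvement within the horizon

def qualitative_hill_climb(start, neighbors, better, worse, goal, max_depth=10):
    # Move along a path to a better position and start over, until the
    # goal is reached or no improvement can be found by the lookahead.
    pos = start
    while not goal(pos):
        path = improve(pos, neighbors, better, worse, max_depth)
        if path is None:
            return None
        pos = path[-1]
    return pos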

	The idea of the $better$ and $worse$ predicates applies also to
games like chess.  Then the search has to find a way to force an
improvement in the position.  It was tried by McCarthy's student Barbara
Liskov (then Huberman) in her 1968 Stanford PhD thesis.  With some
elaboration it worked well for the classical endgame checkmates with a
rook, with two bishops, or with a bishop and knight.  However, it now
seems more effective to explore the idea and its modifications in simpler
problems.

	We plan to explore qualitative hill climbing further.
\smallskip\noindent{This draft of nsf[w88,jmc]\ TEXed on \jmcdate\ at \theTime}
\vfill\eject\end
